PERFECT SIMULATION FOR A CLASS OF POSITIVE 
RECURRENT MARKOV CHAINS 

By Stephen B. Connor and Wilfrid S. Kendall 

University of Warwick 

This paper generalizes the work of Kendall [Electron. Comm. 
Probab. 9 (2004) 140-151], which showed that perfect simulation, in 
the form of dominated coupling from the past, is always possible (al- 
though not necessarily practical) for geometrically ergodic Markov 
chains. Here, we consider the more general situation of positive re- 
current chains and explore when it is possible to produce such a 
simulation algorithm for these chains. We introduce a class of chains 
which we name tame, for which we show that perfect simulation is 
possible. 

1. Introduction. Perfect simulation was first introduced by Propp and 
Wilson [20] as a method for sampling from the exact stationary distribution 
of an ergodic Markov chain. Foss and Tweedie [7] showed that this classic 
coupling from the past (CFTP) algorithm is possible (in principle, if not in 
practice) if and only if the Markov chain is uniformly ergodic. 

More recently, Kendall [14] showed that all geometrically ergodic chains 
possess (again, possibly impractical) dominated CFTP algorithms (as intro- 
duced in [13, 16]). This suggests the questions: what if X is subgeometrically 
ergodic? Might it be the case that all positive recurrent Markov chains pos- 
sess (impractical) domCFTP algorithms? 

In this paper, we introduce a new class of positive-recurrent chains (tame 
chains) for which domCFTP is possible in principle. 

Note that the practicality of CFTP algorithms is subject to a number 
of interesting constraints: methods using coadapted coupling will deliver 
answers at a slower exponential rate than ordinary Markov chain Monte 
Carlo for many chains [1, 19]; in general, the coalescence of paths from many 
different starting states (an intrinsic feature of CFTP) may be expected to be 
slower than pairwise coupling; finally, the theory of randomized algorithms 
can be used to demonstrate the existence of problems for which there will not 
even be any fully-polynomial randomized approximation schemes (subject to 
the complexity theory assumption RP ≠ NP; Jerrum [12] discusses results 
of this nature for counting algorithms for independent sets). 

Considerations of the practicality of CFTP raise many further interesting 
research questions; however, in this paper, we focus on considering whether 
(for all Markov chains with a specified property) there can exist domCFTP 
algorithms, practical or not. 

To make this a meaningful exercise, it is necessary to be clearer about 
what one is allowed to do as part of an impractical algorithm. The [7] result 
for uniform ergodicity presumes that one is able to identify when regen- 
eration occurs for the target Markov chain subsampled every k time steps 
(where k is the order of the whole state space considered as a small set of 
the chain) and that one can then draw from the regeneration distribution 
and the k-step transition probability distribution conditioned on no regeneration. 
One must assume more in order to cover the geometrically ergodic 
case [14], namely, that it is possible to couple the target chain and the dominating 
chain when subsampled every k time steps, preserving the domination 
nating chain when subsampled every k time steps, preserving the domination 
while so doing. Here, k is the order of a small set for a particular Foster- 
Lyapunov criterion for the geometric ergodicity property. In fact, something 
more must also be assumed: it must be possible to implement the coupling 
between target chain and dominating process in a monotonic fashion, even 
when conditioning on small-set regeneration occurring or not occurring. In 
fact, we do not need to assume any more than this when dealing with the 
tame chains introduced below, except that the subsampling order k is now 
not fixed for all time, but can vary according to the current value of the 
dominating process. 

The impracticality of these CFTP algorithms thus has two aspects. First, 
the question of expected run time is not addressed at all. Second, for the 
most part, the assumptions described above amount to supposing that we 
can translate into practice the theoretical possibility of implementing vari- 
ous stochastic dominations as couplings (guaranteed by theory expounded 
in, e.g., [17], Chapter IV). However, it should be noted that practical and 
implemented CFTP algorithms can correspond very closely to these gen- 
eral schemes. For example, the CFTP algorithm resulting from the result of 
Foss and Tweedie [7] is essentially the simplest case of the exact sampling 
algorithm proposed by Green and Murdoch [9]; the scheme proposed in [14] 
is closely related to fast domCFTP algorithms for perpetuities with sample 
step k = 1. 

In this paper, we investigate the problems that occur in the move from 
geometric to subgeometric ergodicity. We begin by recalling some useful re- 
sults concerning rates of ergodicity. Section 2 then reviews the result of [14]. 
The bulk of the new material in this paper is to be found in Section 3. There, 
we introduce the notion of a tame chain (Definition 14) and demonstrate 
that domCFTP is possible for such chains (Theorem 15). A description of 
the domCFTP algorithm for tame chains is provided in Section 3.3; the 
reader is referred to [15] for an introduction to the classical form of dom- 
CFTP. We also prove some sufficient conditions for a polynomially ergodic 
chain to be tame (Theorems 21 and 22). However, these conditions are not 
necessary; Section 4.4 contains an example of a polynomially ergodic chain 
which does not satisfy these conditions and yet is still tame. The existence of 
a polynomially ergodic chain that is not tame is currently an open question. 

1.1. Definitions and notation. Let $X = (X_0, X_1, \ldots)$ be a discrete-time 
Markov chain on a Polish state space $\mathcal{X}$. The Markov transition kernel for 
$X$ is denoted by $P$ and the $n$-step kernel by $P^n$: 

$P^n(x, E) = \mathbb{P}_x[X_n \in E]$, 

where $\mathbb{P}_x$ is the conditional distribution of the chain given $X_0 = x$. The 
corresponding expectation operator will be denoted by $\mathbb{E}_x$. If $g$ is a nonnegative 
function, then we write $Pg(x)$ for the function $\int g(y)\,P(x, dy)$ and, for 
a signed measure $\mu$, we write $\mu(g)$ for $\int g(y)\,\mu(dy)$. The $f$-norm is defined as 
$\|\mu\|_f := \sup_{g : |g| \le f} |\mu(g)|$; taking $f \equiv 1$ yields the usual total variation norm, 
for which we will simply write $\|\mu\|$. 

We assume throughout that $X$ is aperiodic (in the sense of [18]) and 
Harris-recurrent. The stationary distribution of $X$ shall be denoted by $\pi$ and 
the first hitting time of a measurable set $A \subseteq \mathcal{X}$ by $\tau_A = \min\{n \ge 1 : X_n \in A\}$. 

The notion of small sets will feature heavily throughout this paper. 

Definition 1. A subset $C \subseteq \mathcal{X}$ is a small set (of order $m$) for the Markov 
chain $X$ if the following minorization condition holds: for some $\varepsilon \in (0, 1]$ and 
a probability measure $\nu$, 

(1) $\mathbb{P}_x[X_m \in E] \ge \varepsilon\,\nu(E)$ for all $x \in C$ and measurable $E \subseteq \mathcal{X}$. 

In this case, we say that $C$ is $m$-small. Many results in the literature are 
couched in terms of the more general idea of petite sets; however, for aperiodic 
$\phi$-irreducible chains, the two notions are equivalent ([18], Theorem 
5.5.7). Small sets allow the use of coupling constructions: specifically, if $X$ 
hits the small set $C$ at time $n$, then there is a positive chance ($\varepsilon$) that it regenerates 
at time $n + m$ (using the measure $\nu$). Furthermore, if regeneration 
occurs, then a single draw from $\nu$ may be used for any number of copies of 
$X$ belonging to $C$ at time $n$, resulting in their coalescence at time $n + m$. 
Small sets belong to a larger class of pseudo-small sets, as introduced in [22], 
but such sets only allow for the coupling of pairs of chains. Implementation 
of domCFTP requires a positive chance of a continuum of chains coalescing 
when belonging to a given set $C$, so we shall henceforth deal solely with 
small sets. 
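
For a finite state space, the minorization condition (1) can be made concrete. The following is a minimal sketch, assuming the chain is given as a row-stochastic matrix; the function name `minorize` and the example matrix are hypothetical and serve only to illustrate how the constant and the regeneration measure of Definition 1 can be read off from the m-step kernel.

```python
from fractions import Fraction


def minorize(P, C, m=1):
    """Return (eps, nu) such that P^m(x, .) >= eps * nu(.) for every x in C.

    P is a row-stochastic matrix (list of lists) and C a list of state
    indices. Both are illustrative inputs, not objects from the paper.
    """
    n = len(P)
    # Build the m-step kernel P^m by repeated multiplication.
    Pm = [row[:] for row in P]
    for _ in range(m - 1):
        Pm = [[sum(Pm[i][k] * P[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
    # Largest measure lying below every row indexed by C.
    common = [min(Pm[x][y] for x in C) for y in range(n)]
    eps = sum(common)
    if eps == 0:
        raise ValueError("C is not m-small for this choice of m")
    nu = [c / eps for c in common]
    return eps, nu


# A three-state example in which the whole state space is 1-small.
P = [[Fraction(1, 2), Fraction(1, 2), Fraction(0)],
     [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
     [Fraction(0), Fraction(1, 2), Fraction(1, 2)]]
eps, nu = minorize(P, C=[0, 1, 2])
```

Here the column-wise minimum is the largest common component of the rows, so `eps` is the best constant available for the chosen set and order.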
1.2. Geometric ergodicity. We first outline some relevant theory for geometrically 
ergodic chains. 

Definition 2. The chain $X$ is said to be geometrically ergodic if there 
exists a constant $\gamma \in (0, 1)$ and some function $\Lambda : \mathcal{X} \to [0, \infty)$ such that for 
all $x$ in a full and absorbing set, 

(2) $\|P^n(x, \cdot) - \pi(\cdot)\| \le \Lambda(x)\,\gamma^n$. 

If $\Lambda$ can be chosen to be bounded, then $X$ is said to be uniformly ergodic. 

Uniform ergodicity of X can be shown to be equivalent to the whole state 
space X being a small set, in which case at every Markov chain step, there 
is a positive chance of coalescence, whereby chains started at all elements of 
the state space become equal simultaneously. Foss and Tweedie [7] use this 
to show that uniform ergodicity is equivalent to the existence of a CFTP 
algorithm for X in the sense of Propp and Wilson [20]. 
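
When the whole state space is small, the regeneration structure can be turned into an exact sampler directly. The sketch below is an illustration under stated assumptions (a finite chain whose whole state space is 1-small); it is not the algorithm of [7] verbatim, and the names `split_kernel`, `draw` and `exact_sample` are hypothetical. It writes the kernel as a mixture of the regeneration measure and a residual kernel, draws the state at the most recent regeneration before time 0, and then applies the residual kernel for the geometrically distributed number of steps since that regeneration.

```python
import random


def split_kernel(P):
    """Split P = eps * nu + (1 - eps) * R, using the whole state space as small set."""
    n = len(P)
    common = [min(P[x][y] for x in range(n)) for y in range(n)]
    eps = sum(common)
    if eps <= 0:
        raise ValueError("whole state space is not 1-small")
    nu = [c / eps for c in common]
    R = None
    if eps < 1:
        # Residual kernel used when no regeneration occurs.
        R = [[(P[x][y] - common[y]) / (1 - eps) for y in range(n)]
             for x in range(n)]
    return eps, nu, R


def draw(weights, rng):
    """Draw an index with probability proportional to weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def exact_sample(P, rng):
    """Return one draw from the stationary distribution of P."""
    eps, nu, R = split_kernel(P)
    x = draw(nu, rng)            # state at the last regeneration before time 0
    while rng.random() >= eps:   # one residual step per non-regenerating time step
        x = draw(R[x], rng)
    return x
```

The number of residual steps is geometric with success probability `eps`, which mirrors the positive chance of coalescence at every step described above.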

The most common way to establish geometric ergodicity of a chain X is 
to check the following geometric Foster-Lyapunov condition [8]. 

Condition GE. There exist positive constants $\beta < 1$ and $b < \infty$, a small 
set $C$ and a scale function $V : \mathcal{X} \to [1, \infty)$, bounded on $C$, such that 

(3) $\mathbb{E}[V(X_{n+1}) \mid X_n = x] \le \beta V(x) + b\,\mathbf{1}_C(x)$. 

Inequality (3) will be referred to as GE($V, \beta, b, C$) when we need to be 
explicit about the scale function and constants. For simplicity, we will also 
often write inequality (3) as $PV \le \beta V + b\,\mathbf{1}_C$. Under our global assumptions 
on $X$, this drift condition is actually equivalent to $X$ being geometrically 
ergodic ([18], Theorem 15.0.1). Furthermore, if $X$ satisfies (3), then we can 
take $\Lambda = V$ in equation (2). 

Condition GE quantifies the way in which the chain V(X) behaves as a 
supermartingale before X hits C. When the chain hits C, it can then increase 
in expectation, but only by a bounded amount. The following result can be 
extracted from [18], Theorems 15.0.1 and 16.0.1. 

Theorem 3. Suppose that $X$ is $\phi$-irreducible and aperiodic. Then $X$ is 
geometrically ergodic if and only if there exists $\kappa > 1$ such that the corresponding 
geometric moment of the first return time to $C$ is bounded, that 
is, 

(4) $\sup_{x \in C} \mathbb{E}_x[\kappa^{\tau_C}] < \infty$. 

The first hitting time of C is related to drift conditions in the following 
way (extracted from [18], Theorem 11.3.5). 
Theorem 4. For an ergodic chain $X$, the function $V_C(x) = \mathbb{E}_x[\tau_C]$ is 
the pointwise minimal solution to the inequality 

(5) $PV(x) \le V(x) - 1$, $x \notin C$. 

Equation (5) is clearly a weaker drift condition than Condition GE and is 
equivalent to positive recurrence of $X$ [18]. It can be shown that (5) implies 
that all sublevel sets are small [18] and since $V$ is bounded on $C$, we will 
always take $C$ to be a sublevel set of the form $\{x \in \mathcal{X} : V(x) \le d\}$. 

We now present a couple of easy results concerning geometrically ergodic 
chains, which will prove to be of great importance later on. The first demonstrates 
how the scale function $V$ in (3) may be changed to obtain a new drift 
condition using the same small set. 

Lemma 5. If the chain $X$ satisfies Condition GE($V, \beta, b, C$), then for 
any $\xi \in (0, 1]$, 

$PV^\xi \le \beta^\xi V^\xi + b^\xi\,\mathbf{1}_C$. 

Thus GE($V, \beta, b, C$) implies GE($V^\xi, \beta^\xi, b^\xi, C$). 

Proof. Calculus shows that $(x + y)^\xi \le x^\xi + y^\xi$ for $x, y \ge 0$ and $0 < \xi \le 1$. 
Jensen's inequality gives $P(V^\xi) \le (PV)^\xi$, and the result follows 
using (3). □ 

The second result shows that a geometric drift condition persists if we 
subsample the chain at some randomized stopping time. 

Lemma 6. Suppose $X$ satisfies Condition GE($V, \beta, b, C$). Then for any 
positive, integer-valued stopping time $\sigma$ (adapted to the natural filtration 
generated by $X$), we have 

$\mathbb{E}_x[V(X_\sigma)] \le \beta V(x) + b_1\,\mathbf{1}_{C_1}(x)$, 

where $b_1 = b/(1 - \beta)$ and $C_1 = \{x : V(x) \le b/(\beta(1 - \beta)^2)\} \cup C$. 

The same $\beta$, $b_1$ and $C_1$ work for all values of $\sigma$ since the constant $b_1$ 
absorbs the higher-order terms in $\beta$ below. 

Proof of Lemma 6. Iterate the drift condition (3) and treat the cases 
$\{\sigma = 1\}$ and $\{\sigma > 1\}$ separately: 

$$\begin{aligned}
\mathbb{E}_x[V(X_\sigma)] &\le \mathbb{E}_x\Big[\beta^\sigma V(x) + b \sum_{j=1}^{\sigma} \beta^{j-1}\,\mathbf{1}_C(X_{\sigma-j})\Big] \\
&\le (\beta V(x) + b\,\mathbf{1}_C(x))\,\mathbb{P}_x[\sigma = 1] + \Big(\beta^2 V(x) + \frac{b}{1-\beta}\Big)\,\mathbb{P}_x[\sigma > 1] \\
&\le (\beta V(x) + b\,\mathbf{1}_C(x))\,\mathbb{P}_x[\sigma = 1] + (\beta V(x) + b_1\,\mathbf{1}_{C_1}(x))\,\mathbb{P}_x[\sigma > 1] \\
&\le \beta V(x) + b_1\,\mathbf{1}_{C_1}(x).
\end{aligned}$$

□ 
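
The change of scale function in Lemma 5 can be checked numerically on a toy example. The sketch below assumes a reflected random walk on the nonnegative integers (up with probability p < 1/2, down otherwise, holding at 0) with scale function V(x) = r^x; this chain and all function names are hypothetical illustrations, not examples taken from the paper.

```python
import math


def drift_constants(p):
    """Constants for the reflected walk with V(x) = r**x and C = {0}."""
    q = 1.0 - p
    r = math.sqrt(q / p)
    beta = 2.0 * math.sqrt(p * q)   # PV(x) = beta * V(x) for x >= 1
    b = p * r + q - beta            # PV(0) = p*r + q = beta*V(0) + b
    return r, beta, b


def PV_power(x, p, r, xi):
    """One-step expectation of V**xi started from state x."""
    q = 1.0 - p
    down = max(x - 1, 0)            # the walk holds at 0 instead of stepping down
    return p * r ** (xi * (x + 1)) + q * r ** (xi * down)


def lemma5_holds(p, xi, x_max=50, tol=1e-12):
    """Check P(V**xi) <= beta**xi * V**xi + b**xi * 1_C on states 0..x_max."""
    r, beta, b = drift_constants(p)
    for x in range(x_max + 1):
        lhs = PV_power(x, p, r, xi)
        rhs = beta ** xi * r ** (xi * x) + (b ** xi if x == 0 else 0.0)
        if lhs > rhs * (1.0 + tol):
            return False
    return True
```

For example, `lemma5_holds(0.3, 0.5)` checks the inequality with exponent 1/2 on the first 51 states; the small relative tolerance only guards against rounding when the two sides coincide.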
1.3. Polynomial ergodicity. We now turn to polynomially ergodic chains 
and state some results which will prove useful in Section 3.4. 

Definition 7. The chain $X$ is said to be polynomially ergodic if there 
exists $\gamma > 0$ such that for all $x$ in a full and absorbing set, 

(6) $n^\gamma\,\|P^n(x, \cdot) - \pi(\cdot)\| \to 0$ as $n \to \infty$. 

As with geometric ergodicity, there is a Foster-Lyapunov drift condition 
that can be shown [11] to imply polynomial ergodicity (although the two 
are not equivalent in this case). 

Condition PE. There exist constants $\alpha \in (0, 1)$ and $b, c \in (0, \infty)$, a 
small set $C$ and a scale function $V : \mathcal{X} \to [1, \infty)$, bounded on $C$, such that 

(7) $\mathbb{E}[V(X_{n+1}) \mid X_n = x] \le V(x) - c\,V^\alpha(x) + b\,\mathbf{1}_C(x)$. 

We will refer to (7) as PE($V, c, \alpha, b, C$) when we need to be explicit about 
the scale function and constants. 

This drift condition again tells us that $V(X)$ behaves as a supermartingale 
before $X$ hits $C$, but that the drift toward the small set now occurs at a 
subgeometric rate (and hence $\tau_C$ has no exponential moment). Note that 
if $\alpha = 1$, then we regain Condition GE [for $c \in (0, 1)$] and that we do not 
include the case $\alpha = 0$ here, for which the drift condition is equivalent to $X$ 
being simply positive recurrent. 

Polynomially ergodic chains satisfy a result analogous to Lemma 5, with 
a similar proof ([11], Lemma 3.5). 

Lemma 8. If the chain $X$ satisfies Condition PE, then for any $\xi \in (0, 1]$, 
there exists $0 < b_1 < \infty$ such that 

$PV^\xi \le V^\xi - c\,\xi\,V^{\alpha + \xi - 1} + b_1\,\mathbf{1}_C$. 

Note that as in Lemma 5, the same small set $C$ appears in the new drift 
condition when we change scale function in this way. 

Corollary 9. Suppose that $X$ satisfies Condition PE. Then for $x \notin C$, 

$\mathbb{E}_x[\tau_C] \le \dfrac{V^{1-\alpha}(x)}{c(1 - \alpha)}$. 

Proof. Set $\xi = 1 - \alpha$ in Lemma 8 to obtain 

$PV^{1-\alpha}(x) \le V^{1-\alpha}(x) - c(1 - \alpha)$ for $x \notin C$. 

The result then follows from Theorem 4. □ 
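
The final step of this proof can be spelled out as a short worked derivation. The auxiliary function $W$ below is introduced only for this illustration; it rescales the drift inequality so that it takes the form (5), after which the minimality statement of Theorem 4 applies.

```latex
W(x) := \frac{V^{1-\alpha}(x)}{c(1-\alpha)}
\quad\Longrightarrow\quad
PW(x) \le \frac{V^{1-\alpha}(x) - c(1-\alpha)}{c(1-\alpha)} = W(x) - 1,
\qquad x \notin C,
\\[4pt]
\text{so, by the pointwise minimality of } V_C \text{ in Theorem 4,}\quad
\mathbb{E}_x[\tau_C] = V_C(x) \le W(x) = \frac{V^{1-\alpha}(x)}{c(1-\alpha)} .
```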



PERFECT SIMULATION AND POSITIVE RECURRENCE 



Note, however, that there is no analogue to Lemma 6 (even if $\sigma$ is deterministic), 
since the geometric ergodicity case makes essential use of the 
convergence of the series $\sum_j \beta^j$. 

The drift condition (7) can actually be shown to imply much more than 
the convergence in (6). From Theorem 3.6 of [11] we obtain the following 
which will be used in the proof of Theorem 22. 



Proposition 10. Suppose $X$ satisfies Condition PE. Define, for each 
$1 \le p \le 1/(1 - \alpha)$, 

(8) $V_p(x) = V^{1 - p(1-\alpha)}(x)$ and $r_p(n) = (n + 1)^{p-1}$. 

Then there exists a constant $M < \infty$ such that 

(9) $\mathbb{E}_x\Big[\sum_{n=0}^{\tau_C - 1} r_p(n)\,V_p(X_n)\Big] \le M\,V(x)$. 



Furthermore, from [4] we see that an upper bound for M can be obtained 
directly from the drift condition (7). 

2. Geometric ergodicity implies domCFTP. We now give a brief overview 
of the proof that all geometrically ergodic chains possess (not necessarily 
practical) domCFTP algorithms [14]. Recall that coadaptive coupling of 
Markov chains means that both chains have a common past expressed by a 
fixed filtration of $\sigma$-algebras. 



Definition 11. Suppose that $V$ is a scale function for a Harris-recurrent 
Markov chain $X$. We say that the stationary ergodic random process $Y$ on 
$[1, \infty)$ is a dominating process for $X$ based on the scale function $V$ (with 
threshold $h$ and coalescence probability $\varepsilon$) if it can be coupled coadaptively 
to realizations of $X^{x,-t}$ (the Markov chain $X$ begun at $x$ at time $-t$) as 
follows: 

(a) for all $x \in \mathcal{X}$, $n > 0$ and $-t \le 0$, almost surely 

(10) $V(X^{x,-t}_{-t+n}) \le Y_{-t+n} \;\Rightarrow\; V(X^{x,-t}_{-t+n+1}) \le Y_{-t+n+1}$; 

(b) if $Y_n \le h$, then the probability of coalescence at time $n + 1$ is at least 
$\varepsilon$, where coalescence at time $n + 1$ means that the set 

(11) $\{X^{x,-t}_{n+1} : -t \le n \text{ and } V(X^{x,-t}_n) \le Y_n\}$ 

is a singleton set; 

(c) $\mathbb{P}[Y_n \le h]$ is positive. 



The following theorem is the main result of [14]. 



S. B. CONNOR AND W. S. KENDALL 



Theorem 12. If $X$ satisfies the drift condition 

$PV \le \beta V + b\,\mathbf{1}_C$ 

for $0 < \beta < 1$, then there exists a domCFTP algorithm for $X$ (possibly subject 
to subsampling) using a dominating process based on the scale $V$. 

The idea behind the proof of Theorem 12 is that a dominating process 
$Y$ satisfying equation (10) may be obtained by using Markov's inequality 
and the geometric drift condition for $X$. The result is that any chain satisfying 
Condition GE can be dominated by $Y = (d + b/\beta)\exp(U)$, where $U$ is 
the system workload of a D/M/1 queue, sampled at arrivals, with arrivals 
every $\log(1/\beta)$ units of time and service times being independent and of 
unit-rate exponential distribution. $U$ is positive recurrent only if $\beta < e^{-1}$, 
but a new geometric drift condition with $\beta$ replaced by $\beta^{k-1}$ can be produced 
by subsampling $X$ with a fixed subsampling period $k$; the proof uses 
the ideas of Lemma 6. If $k$ is chosen sufficiently large to fix $\beta^{k-1} < e^{-1}$, 
then the above argument produces a stationary dominating process for the 
subsampled chain. 
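
The queue-based dominating process can be sketched in a few lines. The code below is an illustration only: it assumes that "workload sampled at arrivals" is read as the workload just after each arrival (drain for the interarrival time, then add the new unit-rate exponential service time), and the function names are hypothetical. It also computes the smallest fixed subsampling period k for which the condition on the k-1 power of the drift constant stated above holds.

```python
import math
import random


def workload_step(u, service, beta):
    """Advance the D/M/1 workload by one interarrival period of length log(1/beta)."""
    return max(u - math.log(1.0 / beta), 0.0) + service


def simulate_dominator(beta, d, b, n_steps, rng, u0=0.0):
    """Return Y_k = (d + b/beta) * exp(U_k) for k = 0, ..., n_steps."""
    h = d + b / beta
    u, path = u0, []
    for _ in range(n_steps + 1):
        path.append(h * math.exp(u))
        u = workload_step(u, rng.expovariate(1.0), beta)
    return path


def subsampling_period(beta):
    """Smallest integer k >= 1 with beta**(k - 1) < 1/e."""
    k = 1
    while beta ** (k - 1) >= math.exp(-1.0):
        k += 1
    return k
```

For instance, `subsampling_period(0.9)` returns the smallest period for which a chain with drift constant 0.9 meets the positive-recurrence requirement on the queue.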

Note that Y is easy both to sample from in equilibrium and to run in re- 
versed time, which is essential for implementation of domCFTP. Also, note 
that Y belongs to a family of universal dominating processes for geomet- 
rically ergodic chains, although this dominator need not generally lead to 
a practical simulation algorithm. As noted in the introduction, the main 
difficulties in application are in implementing practical domination and in 
determining whether or not regeneration has occurred when Y visits the set 
{Y < h}. This task is rendered even less practical if subsampling has taken 
place, since then, detailed knowledge of convolutions of the transition kernel 
for X is required. 

3. domCFTP for suitable positive recurrent chains. Theorem 12 leads 
to an obvious question: does there exist a similar domCFTP algorithm for 
chains not satisfying Condition GE? [Note that if we try to use the drift con- 
dition (7) — as in the proof of Theorem 12 — to produce a dominating process 
for polynomially ergodic chains, then the resulting process is nonrecurrent.] 
In this section, we introduce a class of chains which possess a domCFTP 
algorithm. 

The principal idea behind the subsequent work is to investigate when it is 
possible to subsample X to produce a geometrically ergodic chain. For non- 
geometrically ergodic chains, a fixed subsampling interval will not work and 
so we seek an appropriate simple adaptive subsampling scheme. A similar 
scheme can then be used to delay the dominating process Y constructed in 
Section 2 and to show that this new process D dominates the chain V(X) 
at the times when D moves. 
Several issues must be addressed in order to derive a domCFTP algorithm 
using this idea: 

1. What is an appropriate adaptive subsampling scheme? 

2. When does such a scheme exist? 

3. How does the dominating process D dominate V(X) when D moves? 

4. Can we simulate D in equilibrium and in reversed time? 

The answers to these questions are quite subtle. 

3.1. Adaptive subsampling. We begin by defining more carefully what 
we mean by an adaptive subsampling scheme. 

Definition 13. An adaptive subsampling scheme for the chain $X$ with 
respect to a scale function $V$ is a sequence of stopping times $\{\theta_n\}$ defined 
recursively by 

(12) $\theta_{n+1} = \theta_n + F(V(X_{\theta_n}))$, 

where $F : [1, \infty) \to \{1, 2, \ldots\}$ is a deterministic function. 

Note that a set of stopping times $\{\theta_n\}$ such that $\{X_{\theta_n}\}$ is uniformly 
ergodic can be produced as follows. Using the Athreya-Nummelin split-chain 
construction [18], we may suppose that there is a state $\omega$ with $\pi(\omega) > 0$. 
Define 

(13) $F(V(x)) = \min\{m > 0 : \mathbb{P}_x[X_m = \omega] > \pi(\omega)/2\}$. 

Then the time until $\{X_{\theta_n}\}$ hits $\omega$ from any starting state $x$ is majorized by 
a geometric random variable with success probability $\pi(\omega)/2$. This implies 
that the subsampled chain is uniformly ergodic, as claimed. $F$ as defined 
in (13) depends upon knowledge of $\pi$, however, and we obviously do not 
have this available to us (it is the distribution from which we are trying to 
sample!). This example shows that adaptive subsampling can have drastic 
effects on $X$. However, construction of a domCFTP algorithm for $X$ using 
this subsampling scheme (in a manner to be described in Section 3.3) turns 
out to be impossible unless $X$ is itself uniformly ergodic. 

Reverting to the previous discussion, suppose that there is an explicit 
adaptive subsampling scheme such that the chain $X' = \{X_{\theta_n}\}$ satisfies Condition 
GE with drift parameter $\beta < e^{-1}$. Then a candidate dominating process 
$D$ can be produced for $V(X)$ in the following way. Begin with an exponential 
queue workload process $Y$ that dominates $V(X')$ (as in Section 2). 
Then slow down $Y$ by generating pauses using some convenient function $S$ 
satisfying $S(z) \ge F(z')$ whenever $z \ge z'$, to produce the process $D$. That is, 
given $D_0 = Y_0 = z$, pause $D$ by setting 

$D_1 = D_2 = \cdots = D_{S(z)-1} = z$. 

Then define the law of $D_{S(z)}$ by $\mathcal{L}(D_{S(z)} \mid D_{S(z)-1} = z) = \mathcal{L}(Y_1 \mid Y_0 = z)$. Iteration 
of this construction leads to a sequence of times $\{\sigma_n\}$ at which $D$ 
moves, defined recursively by 

$\sigma_{n+1} = \sigma_n + S(D_{\sigma_n})$, 

with $D$ constant on each interval of the form $[\sigma_n, \sigma_{n+1})$. 
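
The recursion (12) and the pausing construction are both easy to express in code. The following is a minimal sketch under stated assumptions: the scheme starts at time 0, the path of scale-function values is already available as a list, and the power-type rule built by `make_power_F` is a hypothetical choice of the deterministic function used only to have something concrete to run.

```python
import math


def make_power_F(lam, delta, d_prime):
    """Hypothetical rule: F(z) = ceil(lam * z**delta) for z > d_prime, else 1."""
    def F(z):
        return math.ceil(lam * z ** delta) if z > d_prime else 1
    return F


def subsampling_times(v_path, F):
    """Times theta_0 = 0, theta_{n+1} = theta_n + F(V(X_theta_n)), as in (12).

    v_path[t] holds V(X_t); times falling beyond the end of the path are dropped.
    """
    thetas = [0]
    while True:
        nxt = thetas[-1] + F(v_path[thetas[-1]])
        if nxt >= len(v_path):
            return thetas
        thetas.append(nxt)


def pause(y_path, S):
    """Hold each value y of the unpaused path for S(y) time steps."""
    d_path, jump_times, t = [], [], 0
    for y in y_path:
        jump_times.append(t)          # time at which the paused process takes value y
        d_path.extend([y] * S(y))
        t += S(y)
    return d_path, jump_times
```

The list returned by `pause` is constant between consecutive jump times, matching the description of the paused process above.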

Such a process $D$ is a plausible candidate for a dominating process. To 
be suitable for use in a domCFTP algorithm, however, it must be possible 
to compute its equilibrium distribution. Now, $D$ as we have just defined it 
is only a semi-Markov process: it is Markovian at the times $\{\sigma_n\}$, but not 
during the delays between jumps. To remedy this, we augment the chain by 
adding a second coordinate $N$ that measures the time until the next jump 
of $D$. This yields the Markov chain $\{(D_n, N_n)\}$ on $[0, \infty) \times \{1, 2, \ldots\}$ with 
transitions controlled by 

$\mathbb{P}[D_{n+1} = D_n,\ N_{n+1} = N_n - 1 \mid D_n, N_n] = 1$ if $N_n \ge 2$; 

$\mathbb{P}[D_{n+1} \in E \mid D_n = z,\ N_n = 1] = \mathbb{P}[Y_1 \in E \mid Y_0 = z]$ for all measurable $E \subseteq [1, \infty)$; 

$\mathbb{P}[N_{n+1} = S(D_{n+1}) \mid D_n,\ N_n = 1,\ D_{n+1}] = 1$. 

Using the standard equilibrium equations, if $\tilde\pi$ is the equilibrium distribution 
of $(D, N)$, then 

$\tilde\pi(z, 1) = \tilde\pi(z, 2) = \cdots = \tilde\pi(z, S(z))$ 

and thus $\pi_D(z) = \tilde\pi(z, \cdot) \propto \pi_Y(z)\,S(z)$. Hence, the equilibrium distribution 
of $D$ is the equilibrium of $Y$ reweighted using $S$. It is a classical probability 
result [10] that under stationarity the number of people in the D/M/1 queue 
(used in the construction of $Y$) is geometric with parameter $\eta$, where $\eta$ is 
the smallest positive root of 

$\eta = \beta^{1-\eta}$. 

(Note that $0 < \eta < 1$ since $\beta < e^{-1}$.) Thus the equilibrium distribution of 
the queue workload $U$ is exponential of rate $(1 - \eta)$. Since $Y \propto \exp(U)$, the 
equilibrium density of $Y$, $\pi_Y$, satisfies 

(14) $\pi_Y(z) \propto z^{-(2-\eta)}$. 

Reweighting $Y$ using $S$ yields the equilibrium density of $D$, 

(15) $\pi_D(z) \propto S(z)\,z^{-(2-\eta)}$. 

A suitable pause function $S$ must therefore satisfy $S(z) < z^{1-\eta}$ in order to 
obtain a probability density in (15). The dominating process constructed 
in the proof of Theorem 16 requires $F \le S$ and hence this imposes the 
restriction $F(z) < z^{1-\eta}$; in particular, this means that $F(z)/z \to 0$ as $z \to \infty$. 
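
The root of the fixed-point equation for the geometric parameter has no closed form in elementary functions, but it is simple to compute. The sketch below is a hedged illustration: it uses a monotone fixed-point iteration started from 0 (a choice made here, not taken from the paper), which increases to the smallest positive root when the drift constant is below 1/e. The helper `pause_exponent_ok` is a hypothetical name for the check that a pause function growing like a power with exponent delta still leaves (15) integrable.

```python
import math


def smallest_root_eta(beta, tol=1e-13, max_iter=100000):
    """Smallest positive root of eta = beta**(1 - eta), for 0 < beta < 1/e."""
    if not 0.0 < beta < math.exp(-1.0):
        raise ValueError("need 0 < beta < 1/e")
    eta = 0.0
    for _ in range(max_iter):
        nxt = beta ** (1.0 - eta)    # the map is increasing, so iterates increase from 0
        if abs(nxt - eta) < tol:
            return nxt
        eta = nxt
    raise RuntimeError("fixed-point iteration did not converge")


def pause_exponent_ok(beta, delta):
    """True when z**delta * z**(-(2 - eta)) is integrable at infinity, i.e. eta < 1 - delta."""
    return smallest_root_eta(beta) < 1.0 - delta
```

With the drift constant equal to exp(-2), the root is roughly 0.203, so a pause exponent of 0.5 is admissible while 0.9 is not.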






3.2. Tame and wild chains. The above discussion motivates the following 
definition of a tame chain. We write $\lceil z \rceil$ to denote the smallest integer greater 
than or equal to z. 

Definition 14. A Markov chain $X$ is tame with respect to a scale function 
$V$ if the following two conditions hold: 

(a) there exists a small set $C := \{x : V(x) \le d'\}$ and a nondecreasing 
taming function $F : [1, \infty) \to \{1, 2, \ldots\}$ of the form 

(16) $F(z) = \begin{cases} \lceil \lambda z^\delta \rceil, & z > d', \\ 1, & z \le d', \end{cases}$ 

for some constants $\lambda > 0$, $\delta \in [0, 1)$ such that the chain $X' = \{X_{\theta_n}\}$ possesses 
the drift condition 

(17) $PV \le \beta V + b'\,\mathbf{1}_C$, 

where $\{\theta_n\}$ is an adaptive sampling scheme defined using $F$, as in (12); 

(b) the constant $\beta$ in inequality (17) satisfies 

(18) $\log\beta < \delta^{-1}\log(1 - \delta)$. 



We say that X is tamed (with respect to V) by the function F. We may 
simply say that X is tame, without mention of a specific scale function. A 
chain that is not tame is said to be wild. 

Thus a tame chain is one for which we can exhibit an explicit adaptive 
subsampling scheme using a power function $F$ and for which the subsampled 
chain so produced is geometrically ergodic with sufficiently small $\beta$. 

Note that all geometrically ergodic chains are trivially tame: if $X$ satisfies 
Condition GE($V, \beta, b, C$), then $X$ is tamed by the function 

$F(z) = k$ for $z > \sup_{y \in C} V(y)$, 

for any integer $k > 1 + 1/\log(1/\beta)$. 

Definition 14 is strongly motivated by the discussion in Section 3.1. From 
(16), we see that $F$ produces a simple adaptive subsampling scheme, as 
in Definition 13. $F$ is also a nondecreasing function, which accords with 
our intuition; if $V(X)$ is large, then we expect to wait longer before subsampling 
again, to create enough drift in the chain to produce a geometric 
Foster-Lyapunov condition. Requirement (b) of Definition 14 is made for 
two reasons. First, it ensures that $\beta < e^{-1}$ and so ensures ergodicity of the 
D/M/1 queue workload $U$ used in the construction of $Y$. Second, it ensures 
that the weighted equilibrium distribution of $Y$ using $S$ (as described at the 
end of Section 3.1) is a proper distribution; this will be shown in the proof 
of Theorem 16. 
Kendall [14] shows that a dominating process exists for $V(X')$ even if 
$\beta \ge e^{-1}$, but recall that this involves a further subsampling of $X'$ with a fixed 
period $k$. Here, $\beta < e^{-1}$ is made a requirement of the adaptive subsampling 
process to avoid this situation, since further subsampling of $X'$ would result 
in a composite nondeterministic subsampling scheme. 

The main theorem of this paper is the following. 

Theorem 15. Suppose that X is tame with respect to a scale function V. 
Then there exists a domCFTP algorithm for X using a dominating process 
based on V. 



Theorem 15 is true for all geometrically ergodic chains by the result of [14]. 
As with the results of [7] and [14], this algorithm may not be implementable 
in practice. The proof of Theorem 15 results directly from Theorem 16 and 
the discussion in Section 3.3 below, where a description of the domCFTP 
algorithm is given. 

Theorem 16. Suppose that $X$ satisfies the weak drift condition $PV \le 
V + b\,\mathbf{1}_C$ and that $X$ is tamed with respect to $V$ by the function 

$F(z) = \begin{cases} \lceil \lambda z^\delta \rceil, & z > d', \\ 1, & z \le d', \end{cases}$ 

with the resulting subsampled chain $X'$ satisfying a drift condition $PV \le 
\beta V + b'\,\mathbf{1}_{[V \le d']}$, with $\log\beta < \delta^{-1}\log(1 - \delta)$. Then there exists a stationary 
ergodic process $D$ which dominates $V(X)$ at the times $\{\sigma_n\}$ when $D$ moves. 

Proof. We shall construct a Markov chain $(D, N)$ by starting with a 
process $Y$ and pausing it using a function $S$, to be defined shortly. Before 
beginning the main calculation of the proof, we define some constants. These 
are determined explicitly from the taming function $F$ and the drift conditions 
satisfied by $X$ and $X'$. First, choose $\beta^* > \beta$ such that 

(19) $\log\beta < \log\beta^* < \delta^{-1}\log(1 - \delta)$. 

(That this is possible is a result of the definition of tameness.) Then set 

$a = \dfrac{b'}{1 - \beta}\,(1 + b(\lambda + 1))$, 

$d^* = \min\{z \ge d' : (\beta^* - \beta)z \ge b(\lambda + 1)z^\delta + a\}$, 

$b^* = b(\lambda + 1)(d^*)^\delta + a$, 

$h^* = d^* + b^*/\beta^*$. 

Finally, consider the set $C^* = \{x : V(x) \le h^*\}$. As a sublevel set, $C^*$ is $m$-small 
for some integer $m \ge 1$. We are now in a position to define the function 
$S$: 

(20) $S(z) = \begin{cases} (m \vee F(h^*))\,\lceil \lambda z^\delta \rceil, & z > h^*, \\ m \vee F(h^*), & z \le h^*. \end{cases}$ 

Note that $F(x) \le S(z)$ for all $x \le z$ (since $h^* \ge d'$) and that 

(21) $S(z) \ge m \vee F(h^*) \ge m$ for all $z > 0$. 
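
The pause function (20) and the two properties noted after it can be checked mechanically. The sketch below is a small illustration with hypothetical parameter values; `make_F` and `make_S` are names chosen here, and the numerical values of the threshold and the small-set order are simply passed in as arguments rather than derived from a drift condition.

```python
import math


def make_F(lam, delta, d_prime):
    """Taming function: ceil(lam * z**delta) for z > d_prime, else 1."""
    def F(z):
        return math.ceil(lam * z ** delta) if z > d_prime else 1
    return F


def make_S(lam, delta, h_star, m, F):
    """Pause function of the form (20), with base = max(m, F(h_star))."""
    base = max(m, F(h_star))
    def S(z):
        return base * math.ceil(lam * z ** delta) if z > h_star else base
    return S


# Illustrative values only: lam = 2, delta = 1/2, d' = 4, h* = 9, m = 3.
F = make_F(2.0, 0.5, 4.0)
S = make_S(2.0, 0.5, 9.0, 3, F)
grid = [1.0 + 0.5 * k for k in range(60)]
dominates = all(F(x) <= S(z) for z in grid for x in grid if x <= z)
```

On this grid `dominates` is true, reflecting the fact that the pause function never undercuts the taming function at any smaller argument.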

Define the process $Y = h^*\exp(U)$, where $U$ is the system workload of a 
D/M/1 queue with arrivals every $\log(1/\beta^*)$ time units and service times 
being independent and of unit exponential distribution. Positive recurrence 
of $U$ follows from (19). Pause $Y$ using $S$ (as described in Section 3.1) and call 
the resulting process $D$. The stationary distribution of $D$, as shown at the 
end of Section 3.1, is given by 

(22) $\pi_D(z) \propto S(z)\,z^{-(2-\eta)} \asymp z^{-(2-\eta-\delta)}$ (for $z > h^*$), 

where $\eta < 1$ is the smallest positive solution to the equation 

$\eta = (\beta^*)^{1-\eta}$. 

Now, by our choice of $\beta^*$ above, we have 

$(1 - \eta)^{-1}\log\eta = \log\beta^* < \delta^{-1}\log(1 - \delta)$, 

so $\eta < 1 - \delta$, because the map $u \mapsto (1 - u)^{-1}\log u$ is increasing on $(0, 1)$. 
Hence, $2 - \eta - \delta > 1$, so we see from (22) that $\pi_D$ is a proper 
density. 
Suppose that $(D_{\sigma_n}, N_{\sigma_n}) = (z, S(z))$ and that $V(X_{\sigma_n}) = V(x) \le z$. We 
wish to show that $D_{\sigma_{n+1}}$ dominates $V(X_{\sigma_{n+1}})$, where $\sigma_{n+1} = \sigma_n + S(z)$ is 
the time at which $D$ next moves. Domination at successive times $\{\sigma_j\}$ at 
which $D$ moves then follows inductively. For simplicity in the calculations 
below, we set $\sigma_n = 0$. 

Let $\{\theta_n\}$ be the adaptive subsampling scheme for $X$ defined recursively 
by the taming function $F$. Define a region $R(z) \subseteq \mathcal{X} \times \mathbb{Z}_+$ to be the so-called 
"short sampling" region: 

$R(z) = \{(y, t) : F(V(y)) + t > S(z)\}$. 

In other words, once the chain $\{(X_{\theta_n}, \theta_n)\}$ hits the (deterministic) region $R(z)$ 
(at time $\theta_j$, say), the next subsampling time [$\theta_{j+1} = \theta_j + F(V(X_{\theta_j}))$] will lie 
beyond the time $S(z)$ at which the dominating process moves (see Figure 1). 
Define 

$T(z) = \min\{\theta_n : (X_{\theta_n}, \theta_n) \in R(z)\}$ 

to be a stopping time for $X$ and define 

$T'(z) = \min\{n : (X_{\theta_n}, \theta_n) \in R(z)\}$ 
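to be the associated stopping time for $X'$. [Note that $T'(z) \ge 1$ since $V(x) \le 
z$ implies that $F(V(x)) \le S(z)$.] 

Our aim is to control $\mathbb{E}_x[V(X_{S(z)})]$, recalling that $V \ge 1$ and that $S(z)$ is 
deterministic: 

$$\begin{aligned}
\mathbb{E}_x[V(X_{S(z)})] &= \mathbb{E}_x\big[\mathbb{E}_{X_{T(z)}}[V(X_{S(z)})]\big] \\
&\le \mathbb{E}_x\Big[V(X_{T(z)}) + b \sum_{j=T(z)}^{S(z)-1} \mathbf{1}_C(X_j)\Big] && \text{using the weak drift condition of the theorem} \\
&\le \mathbb{E}_x[V(X_{T(z)})] + b\,\mathbb{E}_x[S(z) - T(z)] \\
&\le \mathbb{E}_x[V(X_{T(z)})] + b\,\mathbb{E}_x[F(V(X_{T(z)}))] && \text{since } S(z) - T(z) < F(V(X_{T(z)})) \text{ by the definition of } R(z) \\
&\le \mathbb{E}_x[V(X_{T(z)})] + b(\lambda + 1)\,\mathbb{E}_x[V(X_{T(z)})^\delta] && \text{by the definition of } F.
\end{aligned}$$

We refer to the fourth line of this display as (23). 

Now, the chain $X' = \{X_{\theta_n}\}$ is geometrically ergodic (since $X$ is tamed by 
$F$), so Lemma 6 tells us that 

(24) $\mathbb{E}_x[V(X_{T(z)})] = \mathbb{E}_x[V(X'_{T'(z)})] \le \beta V(x) + \dfrac{b'}{1 - \beta}\,\mathbf{1}_{C_1}(x)$, 

where $C_1$ is the set supplied by Lemma 6 when it is applied to $X'$. Furthermore, Lemma 5 yields 

(25) $\mathbb{E}_x[V(X_{T(z)})^\delta] = \mathbb{E}_x[V(X'_{T'(z)})^\delta] \le \beta^\delta V^\delta(x) + \Big(\dfrac{b'}{1 - \beta}\Big)^{\delta}\,\mathbf{1}_{C_1}(x) \le V^\delta(x) + \dfrac{b'}{1 - \beta}$. 

Fig. 1. Depiction of the region $R(z)$ (axis labels: $S(z)$ and $t$). 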
Combining equations (23), (24) and (25) and making use of the constants 
defined at the start of this proof, we obtain 

(26) $\mathbb{E}_x[V(X_{S(z)})] \le \beta V(x) + b(\lambda + 1)V^\delta(x) + a \le \beta^* V(x) + b^*\,\mathbf{1}_{[V(x) \le d^*]}$. 

Thus a geometric drift condition holds at time $S(z)$ for all chains $V(X)$ 
with starting states $x$ satisfying $V(x) \le z$. As in the proof of Theorem 12, it 
follows from inequality (26) that $V(X_{S(z)})$ can be dominated by $D_{S(z)}$ [17]. 
□ 

Note that questions 1 and 3 at the start of Section 3 have now been an- 
swered: we have defined what is meant by an adaptive subsampling scheme 
and shown that if this takes a particular (power function) form then a stationary 
process $D$ that dominates $V(X)$ at times $\{\sigma_n\}$ can be produced. 

3.3. The domCFTP algorithm for tame chains. In this section, we de- 
scribe the domCFTP algorithm for tame chains and hence complete the 
proof of Theorem 15. We begin by answering question 4 of Section 3, by 
showing how to simulate (D, N) in equilibrium and in reversed time. Fur- 
thermore, this simulation is quite simple to implement when the function S 
is of the form (20). 

The first point to make here is that one can easily simulate from $\pi_D$ using 
rejection sampling [21]: using (15), for some constant $\gamma > 0$, we have 

$\pi_D(z) \propto \lceil \lambda z^\delta \rceil\,z^{-(2-\eta)} = \gamma\,p(z)\,g(z)$, 

where $p(z) \in [1/2, 1]$ and $g(z)$ is a Pareto density (since $2 - \eta - \delta > 1$, as in 
the proof of Theorem 16). Now, given $D_0 = z_0$ as a draw from $\pi_D$, set $N_0 := 
n_0$, where $n_0 \sim \text{Uniform}\{1, 2, \ldots, S(z_0)\}$. It follows from the construction of 
$(D, N)$ in Section 3.1 that $(D_0, N_0) \sim \tilde\pi$, as required. 
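
The equilibrium draw just described can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact recipe: the acceptance probability is normalised by the bound lam + h_star**(-delta), which is a choice made here for simplicity and need not coincide with the function p(z) of the text, and the function name and arguments (including `base`, standing for the constant multiplier in the pause function) are hypothetical.

```python
import math
import random


def sample_equilibrium_DN(lam, delta, eta, h_star, base, rng):
    """Draw (D_0, N_0): D_0 has density proportional to
    ceil(lam * z**delta) * z**(-(2 - eta)) on z >= h_star, and N_0 is
    uniform on {1, ..., S(D_0)} with S(z) = base * ceil(lam * z**delta)."""
    shape = 1.0 - eta - delta
    if shape <= 0:
        raise ValueError("need eta < 1 - delta for a proper density")
    # ceil(lam * z**delta) / z**delta <= lam + z**(-delta) <= bound for z >= h_star.
    bound = lam + h_star ** (-delta)
    while True:
        # Pareto proposal with density proportional to z**(-(shape + 1)), by inversion.
        z = h_star * (1.0 - rng.random()) ** (-1.0 / shape)
        accept_prob = math.ceil(lam * z ** delta) / (z ** delta * bound)
        if rng.random() < accept_prob:
            s = base * math.ceil(lam * z ** delta)
            return z, rng.randint(1, s)
```

A call such as `sample_equilibrium_DN(1.0, 0.5, 0.2, 4.0, 2, random.Random(0))` returns one starting pair for the augmented chain; the Pareto proposal is valid precisely because the tameness condition keeps the tail exponent above 1.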

The chain (D,N) is then simple to run in reversed time using the facts 
that the jumps of D are those of the underlying exponential queue workload 
process Y and that the pause function S is deterministic. (Recall the forward 
construction on page 9 and see Figure 2. More details can be found in [3].) 

We now show that D is a dominating process for X (at the times when 
D moves) based on the scale function V , with threshold h* (recall Defi- 
nition 11). Also, recall from the proof of Theorem 16 that the set C* = 
{x : V(x) <h*} is m-small. 

First, the proof of Theorem 16 shows that the link between stochastic
domination and coupling [17] may be exploited to couple the various $X^{x,\sigma_{-M}}$
with D such that for all $n \le M$,

(27) $V(X^{x,\sigma_{-M}}_{\sigma_{-n}}) \le D_{\sigma_{-n}} \;\Rightarrow\; V(X^{x,\sigma_{-M}}_{\sigma_{-(n-1)}}) \le D_{\sigma_{-(n-1)}}$.




We now turn to part (b) of Definition 11. Since C* is m-small, there exists
a probability measure $\nu$ and a scalar $\varepsilon \in (0, 1)$ such that for all Borel sets
$B \subseteq [1, \infty)$, whenever $V(x) \le h^*$,

$P[V(X_m) \in B \mid X_0 = x] \ge \varepsilon\,\nu(B)$.

Therefore, since $S(h^*) \ge m$ [as noted in (21)],

$P[V(X_{S(h^*)}) \in B \mid X_0 = x] \ge \varepsilon\,P_\nu^{S(h^*)-m}(B)$,

so C* is S(h*)-small. Furthermore, the stochastic domination which has
been arranged in the construction of D means that for all $u \ge 1$, whenever
$V(x) \le y$,

$P[V(X_{S(y)}) > u \mid X_0 = x] \le P[Y_1 > u \mid Y_0 = y]$.

We can couple in order to arrange for regeneration if a probability measure
$\tilde\nu$ can be identified, defined solely in terms of $P_\nu^{S(h^*)-m}$ and the dominating
jump distribution $P[Y_1 > u \mid Y_0 = y]$, such that for all $u \ge 1$, whenever $V(x) \le y$,

$P[V(X_{S(y)}) > u \mid X_0 = x] - \varepsilon\,P_\nu^{S(h^*)-m}((u, \infty)) \le P[Y_1 > u \mid Y_0 = y] - \varepsilon\,\tilde\nu((u, \infty))$,

$P_\nu^{S(h^*)-m}((u, \infty)) \le \tilde\nu((u, \infty))$;

and

$P[Y_1 \in E \mid Y_0 = y] \ge \varepsilon\,\tilde\nu(E)$

for all measurable $E \subseteq [1, \infty)$.

Recall the following result, a proof of which is provided in [14]. 

Lemma 17. Suppose that U, V are two random variables defined on
$[1, \infty)$ such that:

(a) The distribution $\mathcal{L}(U)$ is stochastically dominated by the distribution
$\mathcal{L}(V)$, that is,

$P[U > u] \le P[V > u]$ for all $u \ge 1$;

(b) U satisfies a minorization condition: for some $\beta \in (0, 1)$ and probability
measure $\psi$,

$P[U \in E] \ge \beta\,\psi(E)$ for all Borel sets $E \subseteq [1, \infty)$.

Then there exists a probability measure $\mu$ stochastically dominating $\psi$ and
such that $\beta\mu$ is minorized by $\mathcal{L}(V)$. Moreover, $\mu$ depends only on $\beta\psi$ and
$\mathcal{L}(V)$.

Fig. 2. Construction of D in reversed time.

Therefore, using Lemma 17, $\mathcal{L}(X_{\sigma_{-(n-1)}} \mid X_{\sigma_{-n}} = x)$ may be coupled to
$\mathcal{L}(D_{\sigma_{-(n-1)}} \mid D_{\sigma_{-n}} = y)$ whenever $V(x) \le y$, in a way that implements
stochastic domination and ensures that all of the $X_{\sigma_{-(n-1)}}$ can regenerate
simultaneously whenever $D_{\sigma_{-n}} \le h^*$.

Finally, it is easy to see that part (c) of Definition 11 is satisfied: the 
system workload U of the queue will hit zero infinitely often and therefore 
D will hit level h* infinitely often. 

We can now describe a domCFTP algorithm based on X which yields a 
draw from the equilibrium distribution. 

Algorithm. 

• Simulate D, as a component of the stationary process (D, N), backward
in time until the most recent $\sigma_{-M} < 0$ for which $D_{\sigma_{-M}} \le h^*$;

• while coalescence does not occur at time $\sigma_{-M}$, extend D backward until
the most recent $\sigma_{-M'} < \sigma_{-M}$ for which $D_{\sigma_{-M'}} \le h^*$ and set $M \leftarrow M'$;

• starting with the unique state produced by the coalescence event at time
$\sigma_{-M}$, simulate the coupled X forward at times $\sigma_{-M}, \sigma_{-(M-1)}, \sigma_{-(M-2)}, \dots$,
up to and including time $\sigma_{-1}$;

• run the chain X forward (from its unique state) from time $\sigma_{-1}$ to 0 (see
Figure 3);

• return $X_0$ as a perfect draw from equilibrium.
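
The control flow of the steps above can be sketched as follows. This is a skeleton only, not an implementation of the couplings themselves: the four callbacks (`next_d_backward`, `try_coalesce`, `step_coupled`, `run_to_zero`) are hypothetical placeholders that a user would have to supply for a specific chain, and a valid implementation must reuse the same randomness for D each time the path is extended further into the past.

```python
from typing import Callable, List, Optional, TypeVar

State = TypeVar("State")


def dom_cftp_skeleton(
    next_d_backward: Callable[[List[float]], float],
    try_coalesce: Callable[[List[float]], Optional[State]],
    step_coupled: Callable[[State, int, List[float]], State],
    run_to_zero: Callable[[State], State],
    h_star: float,
) -> State:
    """Control-flow skeleton; all four callbacks are hypothetical user-supplied pieces."""
    d_path: List[float] = []  # d_path[k - 1] holds D at time sigma_{-k}
    state: Optional[State] = None
    while state is None:
        # Extend D backward until it is next at or below the threshold h*.
        d_path.append(next_d_backward(d_path))
        while d_path[-1] > h_star:
            d_path.append(next_d_backward(d_path))
        # Attempt coalescence at sigma_{-M}, where M = len(d_path); None means failure.
        state = try_coalesce(d_path)
    # Coupled forward simulation from sigma_{-M} up to and including sigma_{-1}.
    for k in range(len(d_path), 1, -1):
        state = step_coupled(state, k, d_path)
    # Final stretch from sigma_{-1} to time 0, without reference to D.
    return run_to_zero(state)
```

Passing the whole backward path of D to each callback is a design choice of this sketch: it lets the coupled steps of X consult the value of D at both ends of each pause.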

Lemma 18. The output of the above algorithm is a draw from the sta- 
tionary distribution of the target chain X . 

Proof. The stochastic domination of (27) and Theorem 2.4 of [17],
Chapter IV guarantee the existence of a joint transition kernel $P^{(n)}_{X,D}$ that
provides domination of X by D and such that the marginal distributions of
X and D are correct. That is, for $x \le y$, with $n = S(y)$, for all $z \ge 1$,

$P^{(n)}_{X,D}(x, y; V^{-1}((z, \infty)), [1, z]) = 0$,

$P^{(n)}_{X,D}(x, y; V^{-1}([1, z]), [1, \infty)) = P^{(n)}_X(x; V^{-1}([1, z]))$,

$\int_{\mathcal{X}} \int_1^z P^{(n)}_{X,D}(x, y; du, dv) = P_D(y; [1, z])$.

The chains X and D (run forward) may therefore be constructed in either
of the following two ways.

1. Given $D_{\sigma_{-m}}$ and $X_{\sigma_{-m}} \le D_{\sigma_{-m}}$, with $n = S(D_{\sigma_{-m}})$:

• draw $D_{\sigma_{-(m-1)}}$ from the probability kernel $P_D(D_{\sigma_{-m}}; \cdot)$;

• draw $X_{\sigma_{-(m-1)}}$ from the regular conditional probability of $X_{\sigma_{-(m-1)}}$ given
$D_{\sigma_{-(m-1)}}$ under the joint kernel $P^{(n)}_{X,D}(X_{\sigma_{-m}}, D_{\sigma_{-m}}; \cdot, \cdot)$;

• draw $X_{\sigma_{-m}+1}, X_{\sigma_{-m}+2}, \dots, X_{\sigma_{-(m-1)}-1}$ as a realization of X conditioned
on the values of $X_{\sigma_{-m}}$ and $X_{\sigma_{-(m-1)}}$ (i.e., as a Markov bridge
between $X_{\sigma_{-m}}$ and $X_{\sigma_{-(m-1)}}$).

2. Given $D_{\sigma_{-m}}$ and $X_{\sigma_{-m}} \le D_{\sigma_{-m}}$, with $n = S(D_{\sigma_{-m}})$:

• draw $X_{\sigma_{-m}+1}, X_{\sigma_{-m}+2}, \dots, X_{\sigma_{-(m-1)}}$ using the normal transition kernel
for X, noting that the distribution of $X_{\sigma_{-(m-1)}}$ is exactly the same as
if it were drawn directly from $P^{(n)}_X(X_{\sigma_{-m}}; \cdot)$;

• draw $D_{\sigma_{-(m-1)}}$ from the regular conditional probability

$P_{D|X}(\cdot \mid D_{\sigma_{-m}}, X_{\sigma_{-m}}, X_{\sigma_{-m}+1}, \dots, X_{\sigma_{-(m-1)}}) = \frac{P^{(n)}_{X,D}(X_{\sigma_{-m}}, D_{\sigma_{-m}}; X_{\sigma_{-(m-1)}}, \cdot)}{P^{(n)}_X(X_{\sigma_{-m}}; X_{\sigma_{-(m-1)}})}$.

Fig. 3. Final stage of the domCFTP algorithm: D (black circles) dominates V(X) (red
triangles) at times $\{\sigma_n\}$. To obtain the draw from equilibrium, $X_0$, X can be run from
time $\sigma_{-1}$ to 0 without reference to D after time $\sigma_{-1}$.

Each of these two methods produces chains X and D which satisfy the 
stochastic domination of (27). Method 1 is that which is effectively used 
by the algorithm, although there is no need for the final superfluous step 
(the Markov bridge) when implementing the algorithm. Method 2, however, 
makes it clear that X has the correct Markov transition kernel to be the re- 
quired target chain. Furthermore, the equivalence of the two schemes proves 
the validity of the final step of the algorithm, where the chain X is run from
time $\sigma_{-1}$ to 0 without reference to D.

Finally, the proof that the algorithm returns a draw from equilibrium
follows a standard renewal theory argument. Consider a stationary version
of the chain X, say $\tilde X$, run from time $-\infty$ to 0. The regenerations of $\tilde X$
(when it visits the small set C*) and those of D (when it hits level h*) form
two positive recurrent renewal processes (with that of $\tilde X$ being aperiodic).
Therefore, if D is started far enough in the past, then there will be a time
$-T$ at which both $\tilde X$ and D regenerate simultaneously. Now, consider the
process $\hat X_n = \tilde X_n 1_{[n \le -T]} + X_n 1_{[n > -T]}$. Clearly, $\hat X$ is stationary and follows
the same transitions as X from time $-T$ to 0. Thus $X_0 = \hat X_0 \sim \pi$, so the
output of the algorithm is indeed a draw from the required equilibrium
distribution. □

This concludes the proof of Theorem 15. We have produced a domCFTP 
algorithm based on the scale function V for the tame chain X . 

3.4. When is a chain tame? As a consequence of Theorem 15, question 
2 of page 8 can be rephrased as: when is a chain tame? Note that a tame 
chain will not necessarily be tamable with respect to all scale functions, of 
course. 

In this section, we present an equivalent definition of tameness and prove 
some sufficient conditions for a polynomially ergodic chain to be tame. The 
following theorem shows that tameness is determined precisely by the be- 
havior of the chain until the time that it first hits the small set C. 

Theorem 19. Suppose that X satisfies the weak drift condition $PV \le V + b1_C$.
Then for $n(x) = o(V(x))$, the following two conditions are equivalent:

(i) there exists $\beta \in (0, 1)$ such that $E_x[V(X_{n(x)})] \le \beta V(x)$ for V(x)
sufficiently large;

(ii) there exists $\beta' \in (0, 1)$ such that $E_x[V(X_{n(x) \wedge \tau_C})] \le \beta' V(x)$ for V(x)
sufficiently large.

Furthermore, if V(x) is sufficiently large, we may take $|\beta - \beta'| < \varepsilon$ for any
$\varepsilon > 0$.

Proof. Since $C = \{x : V(x) \le d\}$ is a sublevel set, we can split the
expectation of $V(X_{n(x) \wedge \tau_C})$ according to whether $\tau_C \le n(x)$ or not, to show
that

$E_x[V(X_{n(x) \wedge \tau_C})] \le \sup_{y \in C} V(y) + E_x[V(X_{n(x)}); \tau_C > n(x)]$
$\le \sup_{y \in C} V(y) + E_x[V(X_{n(x)})]$,

so (i) $\Rightarrow$ (ii).

We now prove the reverse implication. Using the weak drift condition for
X and recalling that n(x) is deterministic, we have

$E_x[V(X_{n(x)}); \tau_C \le n(x)] = \sum_{k=1}^{n(x)} E_x[E_{X_k}[V(X_{n(x)-k})]; \tau_C = k]$
$\le \sum_{k=1}^{n(x)} \sup_{y \in C} E_y[V(X_{n(x)-k})]\, P_x[\tau_C = k]$
$\le \sum_{k=1}^{n(x)} \sup_{y \in C} (V(y) + b(n(x) - k))\, P_x[\tau_C = k]$
$\le d + n(x) b$.

Assuming (ii), we therefore have

$E_x[V(X_{n(x)})] \le E_x[V(X_{n(x) \wedge \tau_C})] + E_x[V(X_{n(x)}); \tau_C \le n(x)]$
$\le \beta' V(x) + d + n(x) b$
$\le \beta V(x)$

for all sufficiently large V(x), since $n(x) = o(V(x))$.

Finally, due to the restriction on the size of n(x), it is clear that $\beta$ and
$\beta'$ may be made arbitrarily close by simply restricting attention to x for
sufficiently large V(x). □

Suppose that we now modify the behavior of a tame chain X when it is
in the small set C. The following simple corollary of Theorem 19 shows that
provided the resulting chain still satisfies a weak drift condition, tameness
is preserved under such modification.

Corollary 20. Suppose X satisfies the drift condition $PV \le V + b1_C$
and that X is tamed by the function F to produce a chain X' satisfying
GE($V, \beta, b', C'$). Let $\hat X$ be a new chain produced by modifying the behavior
of X when in C, such that $\hat X$ satisfies $PV \le V + \hat b 1_C$. Then F also tames
$\hat X$, and the resulting chain $\hat X'$ satisfies GE($V, \hat\beta, \hat b', C'$) for any $\hat\beta \in (\beta, 1)$.

Proof. Write $F_x = F(V(x))$. Since X is tame, Theorem 19 tells us that
for V(x) sufficiently large,

$E_x[V(X_{F_x \wedge \tau_C})] \le \tilde\beta V(x)$

for any $\tilde\beta \in (\beta, 1)$. Now, since

$X 1_{[\tau_C > F_x]} = \hat X 1_{[\hat\tau_C > F_x]}$,

by definition,

$E_x[V(\hat X_{F_x \wedge \hat\tau_C})] \le \tilde\beta V(x)$.

Furthermore, since $\hat X$ satisfies the drift condition $PV \le V + \hat b 1_C$, a second
application of Theorem 19 yields

$E_x[V(\hat X_{F_x})] \le \hat\beta V(x)$,

where $\hat\beta \in (\tilde\beta, 1)$ may be chosen arbitrarily close to $\tilde\beta$ (and hence to $\beta$). Thus
the same function F also tames $\hat X$. □

We have already remarked that all geometrically ergodic chains are tame. 
The next two theorems provide sufficient conditions for a polynomially er- 
godic chain to be tame. 

Theorem 21. Let X be a chain satisfying a drift condition $PV \le V - cV^\alpha + b1_C$
for which V(X) has bounded upward jumps whenever $X \notin C$.
That is, $V(X_1) \le V(X_0) + K$ whenever $X_0 \notin C$, for some constant $K < \infty$.
Then X is tame.

Proof. From Theorem 19, we see that it is sufficient to show that by
choosing an appropriate taming function F, we can obtain the bound

(28) $E_x[V(X_{F(V(x))}); F(V(x)) < \tau_C] \le \beta V(x) + b' 1_{C'}(x)$.

Choose $\beta$ sufficiently small to satisfy

(29) $\log\beta < (1-\alpha)^{-1}\log\alpha$

and then choose $\lambda$ sufficiently large so that $\lambda^{-1} < \beta c(1-\alpha)$. Define the
constant $d_1$ by



c(l — a)A V c(l — a)\, 
and define $C_1 = \{x : V(x) \le d_1\}$. Note that if $x \notin C_1$, then



(30) 



Finally, set $d' = \max\{d, d_1\}$ and let $C' = \{x : V(x) \le d'\}$.
Now, define the taming function F by

^^"1 1, for z < d' . 

Write $F_x = F(V(x))$ to ease notation. Then for $x \notin C'$, since the upward
jumps of V(X) before time $\tau_C$ are bounded above by K, we have

$E_x[V(X_{F_x}); F_x < \tau_C] \le (V(x) + KF_x)\, P_x[\tau_C > F_x]$

$\le (V(x) + KF_x)\, \frac{E_x[\tau_C]}{F_x}$  by Markov's inequality

$\le (V(x) + KF_x)\, \frac{V^{1-\alpha}(x)}{c(1-\alpha)F_x}$  by Corollary 9

$\le \beta V(x)$  using (31) and inequality (30).

Finally, for $x \in C'$, we have

$E_x[V(X_{F_x})] = E_x[V(X_1)] \le V(x) + b$
$\le \beta V(x) + (1-\beta)d' + b$
$= \beta V(x) + b'$,

where $b' = (1-\beta)d' + b < \infty$. Hence, (28) is satisfied for all x and X is tame.

□ 



The following proof makes use of Proposition 10, which was borrowed
from [11]. Note that tameness is clearly monotonic in the drift exponent $\alpha$
since chains satisfying PE($V, c, \alpha, b, C$) also satisfy PE($V, c, \alpha', b, C$) for all
$\alpha' \le \alpha$.

Theorem 22. Let X be a chain satisfying the drift condition $PV \le V - cV^\alpha + b1_C$,
with $\alpha > 3/4$. Then X is tame.

Proof. Let $p = (1-\alpha)^{-1}/2 > 2$ and set $\alpha' = 2\alpha - 1$. Writing
$V_p = V^{1-p(1-\alpha)} = V^{1/2}$ and using Lemma 8, we have

$PV_p \le V_p - V_p^{\alpha'} + b_1 1_C$

for some $b_1 < \infty$. We shall seek a time change that produces a geometric
Foster-Lyapunov condition on this scale, $V_p$. As in the proof of Theorem 21,
we simply need to control

$E_x[V_p(X_{F_x}); F_x < \tau_C]$,

where $F_x = F(V_p(x))$.
By Proposition 10, 



J2 nP- l V p (X ri 

n=0 



CE X 

for some constant M < oo. Thus 
(32) E^AVJ^^tc]^ 



< M7(i) 
M7(i) 

Fr 1 ' 



Now, choose $\beta > 0$ such that $\log\beta < (p-1)\log((p-2)/(p-1))$ and define
the taming function F by

$F(z) = \lceil (\lambda z)^{1/(p-1)} \rceil \vee 1$

for any $\lambda > M/\beta$. Then from inequality (32),

$E_x[V_p(X_{F_x}); F_x < \tau_C] \le \beta V_p(x)$

for $V_p(x)$ sufficiently large. Therefore, F tames X, as required. □
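
As a small illustration of the quantities appearing in this proof, the sketch below evaluates $p$ from the drift exponent, the upper bound that the condition $\log\beta < (p-1)\log((p-2)/(p-1))$ places on $\beta$, and the taming function $F(z) = \lceil(\lambda z)^{1/(p-1)}\rceil \vee 1$. The function names are hypothetical, and the constant M of Proposition 10 is not computed here; `lam` simply stands for any admissible $\lambda$.

```python
import math


def p_from_alpha(alpha):
    """p = (1 - alpha)^(-1) / 2, which exceeds 2 exactly when alpha > 3/4."""
    return 0.5 / (1.0 - alpha)


def beta_upper_bound(p):
    """Largest value excluded by log(beta) < (p - 1) * log((p - 2) / (p - 1))."""
    return ((p - 2.0) / (p - 1.0)) ** (p - 1.0)


def taming_function(z, lam, p):
    """F(z) = max(ceil((lam * z)^(1 / (p - 1))), 1)."""
    return max(1, math.ceil((lam * z) ** (1.0 / (p - 1.0))))
```

For example, a drift exponent of 0.9 gives p = 5 (up to rounding), so any $\beta$ below $(3/4)^4 \approx 0.316$ is admissible.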



In fact, it turns out that any chain satisfying drift Condition PE may be
adaptively subsampled as above to produce a geometrically ergodic chain
(see [2] for details). However, for $\alpha < 3/4$, the pause function produced
leads to an improper equilibrium distribution for the dominating process
of Theorem 16. Connor [2] shows how this lower bound on $\alpha$ may be further
reduced to 0.704, but tameness for $\alpha < 0.704$ remains to be proven. Of course,
this does not rule out the existence of another suitable pause
function, possibly on a different scale.

These two sufficient conditions are not necessary for a chain to be tame:
in Section 4.4, we present an example of a chain that satisfies Condition PE
with drift coefficient $\alpha = 1/2$ and which does not have bounded jumps for
$X \notin C$, and we show explicitly that it is tame.

4. Examples. We now present four explicit examples of polynomially
ergodic chains and show that they are tame. The first two of these are tame 
by Theorem 21 and the third by Theorem 22. The final example, (4.4), shows 
that the sufficient conditions of Theorems 21 and 22 are not necessary for 
X to be tame. 

4.1. Epoch chain. Consider the Markov chain X on {0, 1, 2, ...} with the
following transition kernel: for all $x \in \{0, 1, 2, \dots\}$,

$P(x, x) = \theta_x$;  $P(0, x) = \zeta_x$;
$P(x, 0) = 1 - \theta_x$.

Thus X spends a random length of time (an epoch) at level x before jumping
to 0 and regenerating. Meyn and Tweedie ([18], page 362) show that this
chain is ergodic if $\zeta_x > 0$ for all x and

(33) $\sum_x \zeta_x (1 - \theta_x)^{-1} < \infty$.

Furthermore, they show that the chain is not geometrically ergodic if $\theta_x \to 1$
as $x \to \infty$, regardless of how fast $\zeta_x \to 0$.

Now, suppose that $\theta_x = 1 - \kappa(x+1)^{-\lambda}$ for some suitable $\kappa, \lambda > 0$. We
now slightly strengthen condition (33) on $\{\zeta_x\}$ to obtain a polynomial drift
condition: we require that there exists $\varepsilon > 0$ such that $\sum_x \zeta_x x^{(1+\varepsilon)\lambda} < \infty$.

Let $C = [0, \kappa^{1/\lambda}]$. Then the following drift condition holds:

(34) $E_x[V(X_1)] \le V(x) - \kappa V^\alpha(x) + b1_C(x)$,

where $V(x) = (x+1)^m$, $m = (1+\varepsilon)\lambda$ and $\alpha = \varepsilon/(1+\varepsilon)$. This chain then
satisfies the conditions of Theorem 21 and is therefore tame.
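
The exponent $\alpha = \varepsilon/(1+\varepsilon)$ can be checked numerically. The sketch below computes the exact one-step drift $E_x[V(X_1)] - V(x)$ of the epoch chain for a state $x \ge 1$ with $\theta_x \in (0, 1)$ and compares it with $-\kappa V^\alpha(x)$. The function names and the parameter values used in any call are illustrative assumptions; only the transition rule (stay at x with probability $\theta_x$, otherwise jump to 0) comes from the text.

```python
def theta(x, kappa, lam):
    """Holding probability theta_x = 1 - kappa * (x + 1)^(-lam)."""
    return 1.0 - kappa * (x + 1) ** (-lam)


def scale(x, lam, eps):
    """Scale function V(x) = (x + 1)^m with m = (1 + eps) * lam."""
    return (x + 1) ** ((1.0 + eps) * lam)


def one_step_drift(x, kappa, lam, eps):
    """E_x[V(X_1)] - V(x) for x >= 1: stay with probability theta_x, else jump to 0."""
    t = theta(x, kappa, lam)
    return t * scale(x, lam, eps) + (1.0 - t) * scale(0, lam, eps) - scale(x, lam, eps)


def drift_ratio(x, kappa, lam, eps):
    """Exact drift divided by -kappa * V(x)^alpha, with alpha = eps / (1 + eps)."""
    alpha = eps / (1.0 + eps)
    return one_step_drift(x, kappa, lam, eps) / (-kappa * scale(x, lam, eps) ** alpha)
```

A direct calculation shows that this ratio equals $1 - 1/V(x)$, so the drift away from 0 behaves like $-\kappa V^\alpha(x)$ as x grows.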

4.2. Delayed death process. Consider the Markov chain X on {0, 1, 2, ...}
with the following transition kernel:

$P(x, x) = \theta_x$,  $x \ge 1$,
$P(x, x-1) = 1 - \theta_x$,  $x \ge 1$,
$P(0, x) = \zeta_x > 0$,  $x \in \{0, 1, 2, \dots\}$,

where $\theta_x = 1 - \kappa(x+1)^{-\lambda}$ for some suitable $\kappa > 0$, $\lambda > 1$, and $\zeta_x \to 0$ as
$x \to \infty$ sufficiently fast to ensure that

$E_0[\tau_0] = 1 + \sum_{x=1}^{\infty} \zeta_x \sum_{y=1}^{x} (1 - \theta_y)^{-1} < \infty$,

making X ergodic.

It is simple to show that X is not geometrically ergodic, but that it does
satisfy Condition PE($V, c, \alpha, b, C$) with $V(x) = (x+1)^{2\lambda}$ and $\alpha = (\lambda - 1)/2\lambda$.
Since the upward jumps of V(X) are clearly bounded for $X \ge 1$, the chain
is tame by Theorem 21.

4.3. Random walk Metropolis-Hastings. For a more practical example,
consider a random walk Metropolis-Hastings algorithm on $\mathbb{R}^d$, with proposal
density q and target density p. Fort and Moulines [6] consider the
case where q is symmetric and compactly supported and $\log p(z) \sim -|z|^s$,
$0 < s < 1$, as $|z| \to \infty$. (When d = 1, this class of target densities includes
distributions with tails typically heavier than the exponential, such as the
Weibull distributions; see [6] for more details.) They show that under these
conditions, the Metropolis-Hastings algorithm converges at any polynomial
rate. In particular, it is possible to choose a scale function V such that the
chain satisfies Condition PE with $\alpha > 3/4$. Therefore, by Theorem 22, this
chain is tame.

4.4. Random walk on a half-line. For our final example of a tame chain,
we consider Example 5.1 of Tuominen and Tweedie [23]. This is the random
walk on $[0, \infty)$ given by

(35) $X_{n+1} = (X_n + Z_{n+1})^+$,

where $\{Z_n\}$ is a sequence of i.i.d. real-valued random variables. We suppose
that $E[Z] = -\mu < 0$ (so that 0 is a positive-recurrent atom) and that
$E[(Z^+)^m] = \mu_m < \infty$ for some integer $m \ge 2$.
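
Recursion (35) is straightforward to simulate. The sketch below is illustrative only: `reflected_walk` applies the recursion to any sequence of increments, while `pareto_increment` is a hypothetical increment law (a shifted Pareto variable with assumed `shape` and `shift` values) chosen so that the mean is negative, the second moment of the positive part is finite, and the upper tail is polynomial.

```python
import random


def reflected_walk(x0, increments):
    """Apply X_{n+1} = max(X_n + Z_{n+1}, 0) along a sequence of increments."""
    path = [x0]
    for z in increments:
        path.append(max(path[-1] + z, 0.0))
    return path


def pareto_increment(rng, shape=3.5, shift=2.0):
    """Hypothetical increment: Pareto(shape) on [1, inf) minus shift (mean 1.4 - 2.0 = -0.6)."""
    return rng.paretovariate(shape) - shift
```

With these assumed defaults the increments are bounded below by -1 and have mean -0.6, so the simulated walk returns to 0 repeatedly.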

We also assume that $E[r^{Z^+}] = \infty$ for all $r > 1$, and claim that this forces
X to be subgeometrically ergodic. To see this, consider the chain $\hat X$ which
uses the same downward jumps as X but which stays still when X increases.
That is,

$\hat X_{n+1} = (\hat X_n - Z^-_{n+1})^+$.

Let $\tau_0$ be the first time that X hits 0 and let $\hat\tau_0$ be the corresponding hitting
time for $\hat X$. Note that for all $n \ge 0$,

(36) $E_x[\hat X_{n \wedge \hat\tau_0}] \ge x - E_x[n \wedge \hat\tau_0]\,\hat\mu$,

where $\hat\mu := -E[Z; Z < 0] > 0$. Now, the left-hand side of (36) is dominated
by x, and $E_x[\tau_0] < \infty$, so letting $n \to \infty$ yields

(37) $E_x[\tau_0] \ge E_x[\hat\tau_0] \ge x/\hat\mu$.

Thus, for $r > 1$,

$E_0[r^{\tau_0}] = r\,E_0[E_{X_1}[r^{\tau_0}]]$
$\ge r\,E_0[r^{E_{X_1}[\tau_0]}]$
$\ge r\,E_0[r^{X_1/\hat\mu}] = \infty$  by assumption.

Therefore, by Theorem 3, X is not geometrically ergodic.

Now, [11] show that if $m \ge 2$ is an integer, then X satisfies Condition
PE with $V(x) = (x+1)^m$ and $\alpha = (m-1)/m$. Clearly, the upward jumps
of V(X) when $X \notin C$ are not necessarily bounded, so Theorem 21 cannot
be applied. Furthermore, if $m < 4$, then $\alpha < 3/4$, so Theorem 22 cannot be
applied. However, we now show that X is still tame when m = 2 (and thus
tame for all $m \ge 2$).

(i) First, assume that the law of Z is concentrated on $[-z_0, \infty)$ for some
$z_0 > 0$, so $\mu_2 = E[(Z^+)^2] < \infty$. Then if $x \ge z_0$,

$E_x[(X_1 + 1)^2] = E[(x + 1 + Z)^2]$
$= (x+1)^2 + 2(x+1)E[Z] + E[Z^2]$
$\le (x+1)^2 - 2\mu(x+1) + (\mu_2 + z_0^2)$.

Thus for any $0 < \beta < 1$, there exist $z_\beta > z_0$ and $b_\beta < \infty$ such that, with
$V(x) = (x+1)^2$ and $\alpha = 1/2$,

(38) $E_x[V(X_1)] \le V(x) - (2-\beta)\mu V^\alpha(x) + b_\beta 1_{[x \le z_\beta]}$.

Assume that $\beta < 1/4$ and a corresponding $z_\beta > z_0$ are fixed. Write $C_\beta = [0, z_\beta]$
and for $x > z_\beta$, define $F(V(x)) = \lceil V^{1/2}(x)/\mu \rceil$. Iterating the drift
condition (38), we obtain for $x \notin C_\beta$, with $F_x = F(V(x))$,

$E_x[V(X_{F_x})] \le V(x) - (2-\beta)\mu \sum_{k=0}^{F_x-1} E_x[V^{1/2}(X_k)] + b_\beta F_x$

$\le (x+1)^2 - (2-\beta)\mu \sum_{k=0}^{F_x-1} (x + 1 - k\mu) + b_\beta F_x$

(39)  (since $E_x[V^{1/2}(X_k)] = E_x[X_k + 1] \ge x + 1 - k\mu$)

$\le \left(1 - (2-\beta) + \frac{2-\beta}{2}\right)(x+1)^2 + \gamma(x+1)$  for some $\gamma > 0$,

$\le \frac{\beta}{2} V(x) + \gamma V^{1/2}(x)$.

Thus there exists a sublevel set C' and a constant $b' < \infty$ such that if

$F(V(x)) = \lceil V^{1/2}(x)/\mu \rceil$ for $x \notin C'$,  and  $F(V(x)) = 1$ for $x \in C'$,

then we obtain

$E_x[V(X_{F_x})] \le \beta V(x) + b' 1_{C'}(x)$

with $\beta < 1/4$. Since $\alpha = 1/2$, we satisfy $\log\beta < (1-\alpha)^{-1}\log\alpha$, so this chain
is indeed tame.

(ii) In the general case, we can proceed by truncating the law of Z at a level
$-z_0$ so that the truncated distribution has a negative mean. The resulting
chain, X* say, is tame by the above argument. However, X* stochastically
dominates X on the whole of $[0, \infty)$, so X must also be tame.

A polynomial drift condition can still be shown to hold when $m \in (1, 2)$
[corresponding to drift $\alpha \in (0, 1/2)$]. Furthermore, it is quite simple to
produce an adaptive subsampling scheme in this situation that produces a chain
satisfying condition GE($V, \beta, b, C$). However, it is also necessary to make $\beta$
sufficiently small to satisfy part (b) of Definition 14 and we have not yet
been able to achieve this. Therefore, it is unclear at present whether such
chains are in fact tame.
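
The subsampled drift bound of case (i) can be checked exactly for a toy increment law. The sketch below is an illustration under strong assumptions: Z takes the values +1 and -1 only (so $z_0 = 1$ and $\mu = 1 - 2\,$`p_up`), which makes the expectation computable by propagating the distribution of the walk, but which does not satisfy the heavy-tail assumption of this section. The pause is taken to be $F(V(x)) = \lceil V^{1/2}(x)/\mu \rceil$ and the function name is hypothetical.

```python
import math


def subsampled_drift_ratio(x, p_up=0.25):
    """Exact E_x[V(X_F)] / V(x) with V(x) = (x + 1)^2 and F = ceil((x + 1) / mu)."""
    mu = (1.0 - p_up) - p_up          # mu = -E[Z] for Z in {-1, +1}
    steps = math.ceil((x + 1) / mu)   # pause length F(V(x))
    size = x + steps + 1              # the walk cannot exceed x + steps
    dist = [0.0] * size
    dist[x] = 1.0                     # start from the integer state x
    for _ in range(steps):
        new = [0.0] * size
        for s, pr in enumerate(dist):
            if pr > 0.0:
                if s + 1 < size:
                    new[s + 1] += pr * p_up              # upward step
                new[max(s - 1, 0)] += pr * (1.0 - p_up)  # downward step, reflected at 0
        dist = new
    second_moment = sum(pr * (s + 1) ** 2 for s, pr in enumerate(dist))
    return second_moment / (x + 1) ** 2
```

For a moderately large starting state the returned ratio falls well below 1/4, in line with the requirement $\beta < 1/4$ used above.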

5. Conclusions and questions. We have introduced the concept of a tame
Markov chain and shown that a domCFTP algorithm exists for all such 
chains. This algorithm is not expected to be practical in general, but it 
directly extends the results of [7] and [14]. In a practical setting, of course, 
one would use a dominating process that is better suited to the chain of 
interest. We have proven two sufficient conditions for a polynomially ergodic
chain to be tame and provided an example which demonstrates that neither
of these sufficient conditions is necessary.

Our suspicion, which is shared by those experts with whom we have dis- 
cussed this, is that the following conjecture is true. 

Conjecture 23. There exists a chain satisfying Condition PE which 
is wild. 

On the other hand, we do not rule out the possibility that all polynomially 
ergodic chains are tame. A resolution of this conjecture would do much 
to further our understanding of such chains. The tame/wild classification 
provides some structure to the class of subgeometrically ergodic Markov 
chains that goes beyond the rate at which they converge to equilibrium. 
Although purely theoretical at present, this may prove to be important 
in understanding elaborate MCMC implementations: for a tame chain, the 
existence of a time change which produces a geometrically ergodic chain 
could possibly be exploited to improve the behavior of an MCMC algorithm. 

It is also natural to ask what can be said about the more general case of 
subgeometric ergodicity. The drift condition 

(40) $PV \le V - \phi \circ V + b1_C$

[where $\phi > 0$ is a concave, nondecreasing, differentiable function with $\phi'(t) \to 0$
as $t \to \infty$] is a generalization of (7) which can be shown to imply
subgeometric ergodicity [5]. Much of the work in this paper extends naturally
to chains satisfying this drift condition (see [2] for details). However, it is
possible to produce a version X of the Epoch chain of Section 4.1 that
satisfies (40) but not (7). Furthermore, no subsampling scheme defined using a
function F of the form (16) will result in a geometrically ergodic chain, so
this X is wild. The existence of a perfect simulation algorithm for this and
similar chains is also an open question.